Extracting Salient Keywords from Instructional Videos Using Joint Text, Audio and Visual Cues

Authors

Youngja Park

Ying Li

Abstract

This paper presents a multi-modal, feature-based system for extracting salient keywords from transcripts of instructional videos. Specifically, we propose to extract domain-specific keywords for videos by integrating cues from linguistic and statistical knowledge with derived sound classes and characteristic visual content types. Acquiring such salient keywords will facilitate video indexing and browsing and significantly improve the quality of current video search engines. Experiments on four government instructional videos show that 82% of the salient keywords appear in the top 50% of the ranked keywords. In addition, the audiovisual cues improve precision and recall by 1.1% and 1.5%, respectively.
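The abstract does not say how the text, audio and visual cues are combined, so the following is only a minimal sketch under stated assumptions: it assumes a simple weighted sum of per-modality scores, and the weights, function names and toy keywords are all hypothetical. It illustrates the general idea of ranking candidate keywords by a fused score and then measuring what fraction of the salient keywords falls in the top 50% of the ranking, which is the kind of figure the abstract reports.

```python
from typing import Dict, List, Set

# Hypothetical cue weights; the abstract does not state how cues are combined.
WEIGHTS = {"text": 0.6, "audio": 0.2, "visual": 0.2}


def combined_score(cues: Dict[str, float]) -> float:
    """Weighted sum of per-modality cue scores (missing cues count as 0)."""
    return sum(WEIGHTS[name] * cues.get(name, 0.0) for name in WEIGHTS)


def rank_keywords(candidates: Dict[str, Dict[str, float]]) -> List[str]:
    """Return candidate keywords sorted from highest to lowest combined score."""
    return sorted(
        candidates,
        key=lambda kw: combined_score(candidates[kw]),
        reverse=True,
    )


def coverage_in_top_fraction(
    ranked: List[str], salient: Set[str], fraction: float = 0.5
) -> float:
    """Fraction of salient keywords found in the top `fraction` of the ranking."""
    if not salient:
        return 0.0
    cutoff = int(len(ranked) * fraction)
    top = set(ranked[:cutoff])
    return len(salient & top) / len(salient)


# Toy data, invented purely for illustration (not from the paper).
candidates = {
    "fire extinguisher": {"text": 0.9, "audio": 0.8, "visual": 0.7},
    "evacuation route": {"text": 0.7, "audio": 0.6, "visual": 0.9},
    "next slide": {"text": 0.3, "audio": 0.2, "visual": 0.1},
    "thank you": {"text": 0.1, "audio": 0.1, "visual": 0.0},
}
ranked = rank_keywords(candidates)
coverage = coverage_in_top_fraction(
    ranked, {"fire extinguisher", "evacuation route"}
)
```

A linear combination is used here only because it is the simplest way to show fusion of several cues; the paper's actual scoring scheme may differ.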


Similar resources

An Automated Video Classification and Annotation Using Embedded Audio for Content Based Retrieval

Efficient and effective video classification and annotation demand automated, unsupervised classification and annotation of videos based on their embedded content, as manual indexing is infeasible. Audio is a rich source of information in digital videos that can provide useful descriptors for indexing video databases. Audio archives contrast with image or video archives in a number of...

Finding “It”: Weakly-Supervised Reference-Aware Visual Grounding in Instructional Videos

Grounding textual phrases in visual content with standalone image-sentence pairs is a challenging task. When we consider grounding in instructional videos, this problem becomes profoundly more complex: the latent temporal structure of instructional videos breaks independence assumptions and necessitates contextual understanding for resolving ambiguous visual-linguistic cues. Furthermore, dense ...

Detection of slide transition for topic indexing

This paper presents an automatic and novel approach in detecting the transitions of slides for video sequences of technical lectures. Our approach adopts a foreground vs background segmentation algorithm to separate a presenter from the projected electronic slides. Once a background template is generated, text captions are detected and analyzed. The segmented caption regions as well as backgrou...

Improving Precision of Keywords Extracted From Persian Text Using Word2Vec Algorithm

Keywords can present the main concepts of the text without human intervention according to the model. Keywords are important vocabulary words that describe the text and play a very important role in accurate and fast understanding of the content. The purpose of extracting keywords is to identify the subject of the text and the main content of the text in the shortest time. Keyword extraction pl...

Components in Dynamic Video Content

The fast expansion of Internet and DVB channels has brought a rapid increase in video footage that needs to be indexed for efficient and easy retrieval. This task has historically been done by documentalists who manually tag each video with a few keywords; unfortunately, such work is time-consuming and hence very expensive. In the last decade much effort has been put into building processes whic...


Publication date: 2006
